Abstract

In this paper, we propose a new test statistic for the high-dimensional two-sample
location test that is invariant under scalar and shift transformations. The expectation
of our test statistic is exactly zero under the null hypothesis, and the dimension is
allowed to be arbitrarily large. Theoretical results and simulation comparisons
demonstrate the good performance of our test.


1 Introduction 

This article is concerned with the two-sample Behrens-Fisher problem in high-dimensional
settings. Assume that $\{X_{i1}, \dots, X_{in_i}\}$ for $i = 1, 2$ are two independent random
samples with sizes $n_1$ and $n_2$, from $p$-variate distributions $F(x - \mu_1)$ and
$G(x - \mu_2)$ located at $p$-variate centers $\mu_1$ and $\mu_2$. Denote $n = n_1 + n_2$.
We wish to test

$$H_0: \mu_1 = \mu_2 \quad \text{versus} \quad H_1: \mu_1 \neq \mu_2, \qquad (1)$$

where the covariances $\Sigma_1$ and $\Sigma_2$ are unknown. If $\Sigma_1 = \Sigma_2$, the classic
Hotelling's $T^2$ test is a natural choice when the dimension is fixed and small. However, if
the dimension is larger than the sample sizes, Hotelling's $T^2$ test cannot be applied,
because the pooled sample covariance matrix is then singular. Recently, many efforts
have been devoted to constructing new test procedures for high-dimensional settings. A
natural approach is to replace the sample covariance matrix by the identity matrix (Bai and
Saranadasa 1996, Chen and Qin 2010). However, those test statistics are not scalar-invariant.
Srivastava and Du (2008) proposed a scalar-transformation-invariant test by replacing the
sample covariance matrix with its diagonal matrix, and Srivastava, Katayama and Kano
(2013) extended it to the unequal covariance case. However, their requirement that $p$ be of
smaller order than $n^2$ is too restrictive for high-dimensional settings. Feng et al. (2014)
proposed another scalar-transformation-invariant test which allows the dimension to be of
smaller order than $n^3$. Gregory et al. (2014) proposed the generalized component test with
$p = o(n^6)$. However, requiring $p$ to be of polynomial order in $n$ is still too restrictive
for the "large $p$, small $n$" situation. Park and Ayyala (2013) also proposed a
scalar-transformation-invariant test which allows the dimension to be arbitrarily large.
However, their test is not shift-invariant. Even though each of the variance estimators
involved is ratio-consistent, the differences between these estimators are not negligible.
After some tedious calculation, we can show that


$$E(T_{PA}) = \sum_{k=1}^{p} \frac{2 n^{-2} \mu_k^2 (\sigma_{1k}^2 - \sigma_{2k}^2)^2}{\text{[denominator illegible in the source]}} \, (1 + o(1)),$$

where the common vector is $\mu = (\mu_1, \dots, \mu_p)$ and $n_1/n \to \kappa$. Thus, if the
variances of the two samples are not all equal and the common vector is very large,
$E(T_{PA})$ is not zero even under the null hypothesis. To overcome this issue, we propose a
novel test statistic that is both scalar-invariant and shift-invariant. Under the null
hypothesis, the expectation of our test statistic is exactly zero, so it carries no bias term.
In addition, we impose no relationship between the dimension and the sample sizes: the
dimension $p$ can be arbitrarily large. The asymptotic normality of the proposed test can be
derived under mild conditions similar to those in Chen and Qin (2010).


The rest of the paper is organized as follows. In Section 2, we propose the new test
statistic and establish its asymptotic normality. A simulation comparison is conducted in
Section 3. All technical details are provided in the Appendix.


2 Our test 


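We now propose a new shift- and scalar-transformation-invariant test statistic for the
two-sample problem. Define

$$T_n = \frac{1}{n_1(n_1-1)} \frac{1}{n_2(n_2-1)} \sum_{k=1}^{p} \sum_{i \neq j}^{n_1} \sum_{s \neq t}^{n_2} \frac{(X_{1ik} - X_{2sk})(X_{1jk} - X_{2tk})}{\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}},$$

where $\gamma = n_1/n_2$ and $\hat\sigma^2_{1k(i,j)}$ is the sample variance of
$\{X_{1lk}\}_{l=1}^{n_1}$ excluding $X_{1ik}$ and $X_{1jk}$; $\hat\sigma^2_{2k(s,t)}$ is defined
analogously. Because the numerator $(X_{1ik} - X_{2sk})(X_{1jk} - X_{2tk})$ is independent of the
denominator $\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}$, we have

$$E\left( \frac{(X_{1ik} - X_{2sk})(X_{1jk} - X_{2tk})}{\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}} \right) = E\big((X_{1ik} - X_{2sk})(X_{1jk} - X_{2tk})\big)\, E\big(\{\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}\}^{-1}\big)$$
$$= (\mu_{1k} - \mu_{2k})^2\, E\big(\{\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}\}^{-1}\big).$$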
Unlike Park and Ayyala (2013), who use three different estimators of
$\sigma_{1k}^2 + \gamma\sigma_{2k}^2$ for the three parts of their test statistic, we use the
leave-two-out sample variance for each numerator. Consequently,
$E(\{\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}\}^{-1})$ is exactly the same for each
numerator. Then,

$$E(T_n) = \sum_{k=1}^{p} (\mu_{1k} - \mu_{2k})^2\, E\big(\{\hat\sigma^2_{1k(i,j)} + \gamma \hat\sigma^2_{2k(s,t)}\}^{-1}\big).$$

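Thus, under the null hypothesis $H_0$, $E(T_n)$ is exactly zero. Furthermore, under
Conditions (C1)-(C3) stated next, we can show that

$$E(T_n) = \|\Lambda(\mu_1 - \mu_2)\|^2 + o\big(\sqrt{var(T_n)}\big),$$

$$var(T_n) = \left\{ \frac{2}{n_1(n_1-1)} tr((\Lambda\Sigma_1\Lambda)^2) + \frac{2}{n_2(n_2-1)} tr((\Lambda\Sigma_2\Lambda)^2) + \frac{4}{n_1 n_2} tr(\Lambda\Sigma_1\Lambda^2\Sigma_2\Lambda) \right\} (1 + o(1)),$$

where $\Lambda = diag\{(\sigma_{11}^2 + \gamma\sigma_{21}^2)^{-1/2}, \dots, (\sigma_{1p}^2 + \gamma\sigma_{2p}^2)^{-1/2}\}$.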
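The definition of $T_n$ can be made concrete with a small numerical sketch. The code below is a direct, unoptimized evaluation of the quadruple sum for small samples, not the authors' implementation. The function names `leave_two_out_var` and `fs_statistic` are hypothetical, and the sample variance is assumed to use the usual $n-1$ type divisor (here $n_i - 3$ after removing two observations), which the text does not state explicitly. Both samples need at least four observations.

```python
import numpy as np


def leave_two_out_var(X):
    """V[i, j, k] = sample variance (ddof=1) of column k of X without rows i and j.

    Entries with i == j are not meaningful and are never used below.
    Assumption: the "sample variance" uses the unbiased divisor.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # centering does not change variances, but helps stability
    s1 = Xc.sum(axis=0) - Xc[:, None, :] - Xc[None, :, :]
    s2 = (Xc ** 2).sum(axis=0) - Xc[:, None, :] ** 2 - Xc[None, :, :] ** 2
    m = n - 2  # observations left after removing two
    return (s2 - s1 ** 2 / m) / (m - 1)


def fs_statistic(X1, X2):
    """Naive evaluation of T_n for samples X1 (n1 x p) and X2 (n2 x p)."""
    n1, n2 = X1.shape[0], X2.shape[0]
    gamma = n1 / n2
    V1, V2 = leave_two_out_var(X1), leave_two_out_var(X2)
    total = 0.0
    for i in range(n1):
        for j in range(n1):
            if i == j:
                continue
            for s in range(n2):
                t = np.arange(n2) != s                    # all t different from s
                num = (X1[i] - X2[s]) * (X1[j] - X2[t])   # shape (n2 - 1, p)
                den = V1[i, j] + gamma * V2[s][t]         # shape (n2 - 1, p)
                total += (num / den).sum()
    return total / (n1 * (n1 - 1) * n2 * (n2 - 1))
```

Adding the same vector to both samples, or rescaling each coordinate by a positive constant, leaves the returned value unchanged up to rounding, which is the invariance property claimed for the statistic.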

To establish the asymptotic normality of $T_n$, we need the following conditions. As in
Bai and Saranadasa (1996) and Chen and Qin (2010), we assume that the $X_{ij}$'s come from
the following multivariate model:

$$X_{ij} = \Gamma_i z_{ij} + \mu_i \quad \text{for } j = 1, \dots, n_i,\ i = 1, 2, \qquad (2)$$

where each $\Gamma_i$ is a $p \times m$ matrix for some $m \ge p$ such that
$\Gamma_i \Gamma_i^T = \Sigma_i$, and $\{z_{ij}\}_{j=1}^{n_i}$ are $m$-variate independent and
identically distributed random vectors such that

$$E(z_{ij}) = 0, \quad var(z_{ij}) = I_m, \quad E(z_{ijl}^4) = 3 + \Delta,\ \Delta > 0, \quad E(z_{ijl}^8) = m_8 \in (0, \infty),$$
$$E(z_{ijk_1}^{\alpha_1} z_{ijk_2}^{\alpha_2} \cdots z_{ijk_q}^{\alpha_q}) = E(z_{ijk_1}^{\alpha_1}) E(z_{ijk_2}^{\alpha_2}) \cdots E(z_{ijk_q}^{\alpha_q}), \qquad (3)$$

for a positive integer $q$ such that $\sum_{k=1}^{q} \alpha_k \le 8$ and
$k_1 \neq k_2 \neq \cdots \neq k_q$. Here $z_{ijl}$ denotes the $l$-th component of $z_{ij}$.
This data structure generates a rich collection of $X_{ij}$ from $z_{ij}$ with a given
covariance. Additionally, we need the following conditions as $n, p \to \infty$:

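(C1) $n_1/(n_1 + n_2) \to \kappa \in (0, 1)$.

(C2) $tr(\Lambda\Sigma_i\Lambda^2\Sigma_j\Lambda^2\Sigma_l\Lambda^2\Sigma_h\Lambda) = o(tr^2\{(\Lambda\Sigma_1\Lambda + \Lambda\Sigma_2\Lambda)^2\})$ for $i, j, l, h = 1$ or $2$.

(C3) $(\mu_1 - \mu_2)^T \Lambda^2\Sigma_i\Lambda^2 (\mu_1 - \mu_2) = o(n^{-1} tr((\Lambda\Sigma_1\Lambda + \Lambda\Sigma_2\Lambda)^2))$ for $i = 1, 2$, and
$((\mu_1 - \mu_2)^T \Lambda (\mu_1 - \mu_2))^2 = o(n^{-1} tr((\Lambda\Sigma_1\Lambda + \Lambda\Sigma_2\Lambda)^2))$.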
The following theorem establishes the asymptotic null distribution of $T_n$.


Theorem 1 Under Conditions (C1)-(C3), as $p, n \to \infty$,

$$\frac{T_n - E(T_n)}{\sqrt{var(T_n)}} \xrightarrow{d} N(0, 1).$$


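Then, in order to formulate a testing procedure based on Theorem 1, we need to estimate
the trace terms in $var(T_n)$. Here, we adopt the following ratio-consistent estimators from
Feng et al. (2014):

$$\widehat{tr((\Lambda\Sigma_s\Lambda)^2)} = \frac{1}{2 P_{n_s}^4} \sum^{*} (X_{si_1} - X_{si_2})^T \hat D^{-1}_{s(i_1,i_2,i_3,i_4)} (X_{si_3} - X_{si_4})$$
$$\times\, (X_{si_3} - X_{si_2})^T \hat D^{-1}_{s(i_1,i_2,i_3,i_4)} (X_{si_1} - X_{si_4}),$$

$s = 1, 2$, and

$$\widehat{tr(\Lambda\Sigma_1\Lambda^2\Sigma_2\Lambda)} = \frac{1}{4 P_{n_1}^2 P_{n_2}^2} \sum_{i_1 \neq i_2}^{n_1} \sum_{i_3 \neq i_4}^{n_2} \left\{ (X_{1i_1} - X_{1i_2})^T \hat D^{-1}_{(i_1,i_2,i_3,i_4)} (X_{2i_3} - X_{2i_4}) \right\}^2,$$

where

$$\hat D_{1(i_1,i_2,i_3,i_4)} = diag(\hat\sigma^2_{11(i_1,i_2,i_3,i_4)} + \gamma\hat\sigma^2_{21}, \dots, \hat\sigma^2_{1p(i_1,i_2,i_3,i_4)} + \gamma\hat\sigma^2_{2p}),$$
$$\hat D_{2(i_1,i_2,i_3,i_4)} = diag(\hat\sigma^2_{11} + \gamma\hat\sigma^2_{21(i_1,i_2,i_3,i_4)}, \dots, \hat\sigma^2_{1p} + \gamma\hat\sigma^2_{2p(i_1,i_2,i_3,i_4)}),$$
$$\hat D_{(i_1,i_2,i_3,i_4)} = diag(\hat\sigma^2_{11(i_1,i_2)} + \gamma\hat\sigma^2_{21(i_3,i_4)}, \dots, \hat\sigma^2_{1p(i_1,i_2)} + \gamma\hat\sigma^2_{2p(i_3,i_4)}),$$

and $\hat\sigma^2_{sk(i_1,\dots,i_l)}$ is the sample variance of the $k$-th component of the $s$-th
sample after excluding the observations indexed by $i_j$, $j = 1, \dots, l$, for $s = 1, 2$,
$l = 2, 4$, $k = 1, \dots, p$. The normalizing constants $1/2$ and $1/4$ are the ones that make
each summand unbiased when the diagonal weight matrix is held fixed; they are not legible in
the source. Throughout this article, we use $\sum^{*}$ to denote summation over distinct
indexes. For example, in $\widehat{tr((\Lambda\Sigma_1\Lambda)^2)}$, the summation is over the set
$\{i_1 \neq i_2 \neq i_3 \neq i_4\}$, for all $i_1, i_2, i_3, i_4 \in \{1, \dots, n_1\}$, and
$P_n^m = n!/(n - m)!$.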

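As a consequence, a ratio-consistent estimator of $var(T_n)$ under $H_0$ is

$$\hat\sigma_n^2 = \widehat{var}(T_n) = \frac{2}{n_1(n_1-1)} \widehat{tr((\Lambda\Sigma_1\Lambda)^2)} + \frac{2}{n_2(n_2-1)} \widehat{tr((\Lambda\Sigma_2\Lambda)^2)} + \frac{4}{n_1 n_2} \widehat{tr(\Lambda\Sigma_1\Lambda^2\Sigma_2\Lambda)}.$$

This result suggests rejecting $H_0$ at significance level $\alpha$ if
$T_n/\hat\sigma_n > z_\alpha$, where $z_\alpha$ is the upper $\alpha$ quantile of $N(0, 1)$.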
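The rejection rule is a one-sided normal test, which can be sketched in a few lines. The function name `fs_test_decision` is hypothetical, and the inputs `t_n` and `sigma_hat` are assumed to have been computed beforehand (the value of $T_n$ and the square root of the variance estimator above).

```python
from statistics import NormalDist


def fs_test_decision(t_n, sigma_hat, alpha=0.05):
    """Return (reject, p_value) for the one-sided rule T_n / sigma_hat > z_alpha."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha)  # upper alpha quantile of N(0, 1)
    z = t_n / sigma_hat                        # standardized statistic
    p_value = 1.0 - std_normal.cdf(z)          # one-sided p-value
    return z > z_alpha, p_value
```

The test is one-sided because, by the expectation formula for $T_n$, departures from the null hypothesis push the statistic upward.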
3 Simulation


Here we report a simulation study designed to evaluate the performance of our proposed
test (abbreviated as FS). We compare our test with the methods proposed by Chen and Qin
(2010) (abbreviated as CQ), Srivastava, Katayama and Kano (2013) (abbreviated as
SKK), and Park and Ayyala (2013) (abbreviated as PA) under the unequal covariance matrices
assumption. Following Chen and Qin (2010), we consider the moving average model

$$X_{ijk} = \rho_{i1} Z_{ijk} + \rho_{i2} Z_{ij(k+1)} + \cdots + \rho_{iT_i} Z_{ij(k+T_i-1)} + \mu_{ik}$$

for $i = 1, 2$, $j = 1, \dots, n_i$ and $k = 1, \dots, p$, where the $\{Z_{ijk}\}$ are i.i.d. random
variables. We consider two scenarios for the innovations $\{Z_{ijk}\}$: (Scenario I) all the
$\{Z_{ijk}\}$ are from $N(0, 1)$; (Scenario II) the first half of the components of
$\{Z_{ijk}\}_{k=1}^{p}$ are from a centralized Gamma(4,1) distribution, so that they have zero
mean, and the remaining half are from $N(0, 1)$. The coefficients
$\{\rho_{il}\}_{l=1}^{T_i}$ are generated independently from $U(2, 3)$ and are kept fixed once
generated throughout our simulations. The correlations between $X_{ijk}$ and $X_{ijl}$ are
determined by $|k - l|$ and $T_i$. We choose $T_1 = 3$ and $T_2 = 4$ to generate different
covariances of $X_{ij}$.
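A minimal sketch of this data-generating model is given below. The helper names `moving_average_sample` and `draw_innovations` are hypothetical. Two details are assumptions rather than statements from the text: each observation uses $p + T_i - 1$ innovations so that the last component has a full window, and in Scenario II only the first $\lfloor p/2 \rfloor$ innovations are centered Gamma(4,1) while all others are standard normal.

```python
import numpy as np


def moving_average_sample(Z, rho, mu):
    """Return X with X[j, k] = sum_l rho[l] * Z[j, k + l] + mu[k].

    Z has shape (n, p + T - 1), rho has length T, mu has length p.
    """
    rho = np.asarray(rho, dtype=float)
    T = rho.size
    p = Z.shape[1] - T + 1
    X = sum(rho[l] * Z[:, l:l + p] for l in range(T))
    return X + np.asarray(mu, dtype=float)


def draw_innovations(n, p, T, scenario, rng):
    """Draw the innovations for one group (assumed layout, see lead-in)."""
    Z = rng.standard_normal((n, p + T - 1))
    if scenario == "II":
        half = p // 2
        # Gamma(4, 1) has mean 4, so subtracting 4 centers it at zero.
        Z[:, :half] = rng.gamma(shape=4.0, scale=1.0, size=(n, half)) - 4.0
    return Z
```

For example, `moving_average_sample(draw_innovations(15, 800, 3, "I", rng), rho1, mu0)` would produce one sample of the first group, where `rho1` holds three coefficients drawn once from $U(2, 3)$ and `mu0` is the common mean vector.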

We examine the empirical sizes and the estimation efficiency of the tests. Under the null
hypothesis, the components of the common vector $\mu_1 = \mu_2 = \mu_0 = (\mu_1, \dots, \mu_p)$
are generated from $U(0, A)$. The sample sizes are $n_1 = n_2 = 15$. First, we consider the
impact of dimension. We fix $A = 10$ and consider six dimensions
$p = 25, 50, 100, 200, 400, 800$. We summarize the simulation results using the
mean-standard deviation ratio (MDR) $E(T)/\sqrt{var(T)}$ and the variance ratio (VR)
$\widehat{var}(T)/var(T)$. Since the explicit forms of $E(T)$ and $var(T)$ are difficult
to calculate, we estimate them by simulation. Figure 1 reports the MDR, VR and empirical
sizes of these four tests with different dimensions. We observe that the MDR and VR of the
SKK test exceed zero and one, respectively, as the dimension grows. This is not surprising,
because SKK requires the dimension to be of smaller order than $n^2$. Second, we consider
the impact of common shifts. We fix the dimension $p = 800$ and consider five common shifts
$A = 10, 20, 30, 40, 50$. Figure 2 reports the MDR, VR and empirical sizes of these four tests
with different common shifts. The MDR and VR of the PA test become larger as the common
shift increases, which further demonstrates that the PA test is not shift-invariant. In
contrast, the MDR and VR of our test are approximately zero and one, respectively, so its
empirical size is well controlled. The empirical sizes of the other three tests, however,
deviate from the nominal level in most cases.
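The three summaries can be computed from Monte Carlo replicates as sketched below. The function name `summarize_null_replicates` is hypothetical. It assumes that each replicate yields a value of the statistic and a value of its variance estimator, that $E(T)$ and $var(T)$ are replaced by the Monte Carlo mean and variance, and that the VR numerator is the average of the variance estimates; the text does not spell out these choices.

```python
import numpy as np
from statistics import NormalDist


def summarize_null_replicates(t_values, var_estimates, alpha=0.05):
    """Return (MDR, VR, empirical size) from replicates generated under H0."""
    t = np.asarray(t_values, dtype=float)
    v = np.asarray(var_estimates, dtype=float)
    mc_var = t.var(ddof=1)                 # Monte Carlo estimate of var(T)
    mdr = t.mean() / np.sqrt(mc_var)       # should be near 0 for an unbiased statistic
    vr = v.mean() / mc_var                 # should be near 1 for a good variance estimator
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    size = float(np.mean(t / np.sqrt(v) > z_alpha))
    return mdr, vr, size
```

An MDR far from zero signals a bias term in the statistic, while a VR far from one signals that the variance estimator is off; either one distorts the empirical size.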

Figure 1: The MDR, VR and empirical sizes of tests with different dimensions. (Panels: Mean-SD-Ratio, Variance-Ratio, Size.)

Next, we compare the power of all these tests. Here, we only report the case
$n_1 = n_2 = 15$, $p = 800$. For the alternative hypothesis, $\mu_1 = \mu_0 + \mu$ and
$\mu_2 = \mu_0$, where $\mu_0$ is generated as above. We choose $\mu$ in two ways: (Case A) all
of the components of $\mu$ are nonzero with equal magnitude; (Case B) a randomly chosen half
of the components are nonzero with equal magnitude. To make the power comparable among the
configurations of $H_1$, we set
$\eta := \|\mu_1 - \mu_2\|^2 / \sqrt{tr(\Sigma_1^2) + tr(\Sigma_2^2)} = 0.15, 0.2, 0.25, 0.3, 0.35$
throughout the simulation. Figure 3 reports the empirical power of these four tests. Under
Scenario II, CQ is less powerful than the other three tests because it is not
scalar-invariant. Furthermore, our test performs better than the SKK and PA tests in all
cases. Taken together, these results suggest that the newly proposed FS test is scale- and
shift-invariant, efficient and robust in testing the equality of locations, and
particularly useful when the variances of the components are not equal and the dimension
is ultra-high.
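The calibration of the signal strength $\eta$ can be illustrated as follows. The helper names `ma_covariance` and `alternative_shift` are hypothetical. The sketch assumes unit-variance innovations when forming the moving-average covariance (as in Scenario I) and positive signs for all nonzero components of the shift, neither of which is stated in the text.

```python
import numpy as np


def ma_covariance(rho, p):
    """Covariance of (X_1, ..., X_p) with X_k = sum_l rho[l] * Z_{k+l}, Var(Z) = 1."""
    rho = np.asarray(rho, dtype=float)
    T = rho.size
    B = np.zeros((p, p + T - 1))
    for l in range(T):
        B[np.arange(p), np.arange(p) + l] = rho[l]  # row k carries rho at columns k..k+T-1
    return B @ B.T


def alternative_shift(eta, sigma1, sigma2, case, rng):
    """Return mu with ||mu||^2 / sqrt(tr(S1^2) + tr(S2^2)) equal to eta."""
    p = sigma1.shape[0]
    target = eta * np.sqrt(np.trace(sigma1 @ sigma1) + np.trace(sigma2 @ sigma2))
    support = np.ones(p, dtype=bool)            # Case A: every component is nonzero
    if case == "B":                             # Case B: a random half is nonzero
        support[:] = False
        support[rng.choice(p, size=p // 2, replace=False)] = True
    mu = np.zeros(p)
    mu[support] = np.sqrt(target / support.sum())  # equal magnitude on the support
    return mu
```

Because the squared norm is fixed by $\eta$, Case B concentrates the same total signal on half as many coordinates, so each nonzero component is larger by a factor of $\sqrt{2}$ than in Case A.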
Figure 2: The MDR, VR and empirical sizes of tests with different A.

Appendix: Proof of Theorem 1 


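Firstly, after some tedious calculations, we decompose $T_n$ into two parts, that is,

$$T_n = \frac{1}{n_1(n_1-1)} \frac{1}{n_2(n_2-1)} \sum_{k=1}^{p} \sum_{i \neq j}^{n_1} \sum_{s \neq t}^{n_2} \frac{(X_{1ik} - X_{2sk})(X_{1jk} - X_{2tk})}{\sigma_{1k}^2 + \gamma\sigma_{2k}^2}$$
$$+ \frac{1}{n_1(n_1-1)} \frac{1}{n_2(n_2-1)} \sum_{k=1}^{p} \sum_{i \neq j}^{n_1} \sum_{s \neq t}^{n_2} (X_{1ik} - X_{2sk})(X_{1jk} - X_{2tk}) \left( \frac{1}{\hat\sigma^2_{1k(i,j)} + \gamma\hat\sigma^2_{2k(s,t)}} - \frac{1}{\sigma_{1k}^2 + \gamma\sigma_{2k}^2} \right)$$
$$=: T_{n1} + T_{n2}.$$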


Figure 3: The power of tests with different $\eta$ when $n_1 = n_2 = 15$, $p = 800$. (Panels: Scenario I and Scenario II, each with Case A and Case B.)
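Then, it is straightforward to see that

$$E(T_{n1}) = \sum_{k=1}^{p} \frac{(\mu_{1k} - \mu_{2k})^2}{\sigma_{1k}^2 + \gamma\sigma_{2k}^2} = \|\Lambda(\mu_1 - \mu_2)\|^2,$$

$$var(T_{n1}) = \frac{2}{n_1(n_1-1)} tr((\Lambda\Sigma_1\Lambda)^2) + \frac{2}{n_2(n_2-1)} tr((\Lambda\Sigma_2\Lambda)^2) + \frac{4}{n_1 n_2} tr(\Lambda\Sigma_1\Lambda^2\Sigma_2\Lambda)$$
$$+ \frac{4}{n_1} (\mu_1 - \mu_2)^T \Lambda^2\Sigma_1\Lambda^2 (\mu_1 - \mu_2) + \frac{4}{n_2} (\mu_1 - \mu_2)^T \Lambda^2\Sigma_2\Lambda^2 (\mu_1 - \mu_2).$$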


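Lemma 1 Under the same conditions as Theorem 1, as $p$ and $n \to \infty$,

$$\frac{T_{n1} - E(T_{n1})}{\sqrt{var(T_{n1})}} \xrightarrow{d} N(0, 1).$$

This lemma is a direct corollary of Theorem 1 in Chen and Qin (2010), applied to the
rescaled observations $\Lambda X_{ij}$.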
















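Next, we only need to show that $T_{n2} = o_p(\sqrt{var(T_{n1})})$. Define
$Y_{ijk} = X_{ijk} - \mu_{ik}$, $i = 1, 2$, $j = 1, \dots, n_i$, $k = 1, \dots, p$. Then

$$T_{n2} = \frac{1}{n_1(n_1-1)} \frac{1}{n_2(n_2-1)} \sum_{k=1}^{p} \sum_{i \neq j}^{n_1} \sum_{s \neq t}^{n_2} (Y_{1ik} - Y_{2sk})(Y_{1jk} - Y_{2tk}) \left( \frac{1}{\hat\sigma^2_{1k(i,j)} + \gamma\hat\sigma^2_{2k(s,t)}} - \frac{1}{\sigma_{1k}^2 + \gamma\sigma_{2k}^2} \right)$$
$$+ \frac{2}{n_1(n_1-1)\, n_2(n_2-1)} \sum_{k=1}^{p} \sum_{i \neq j}^{n_1} \sum_{s \neq t}^{n_2} (Y_{1ik} - Y_{2sk})(\mu_{1k} - \mu_{2k}) \left( \frac{1}{\hat\sigma^2_{1k(i,j)} + \gamma\hat\sigma^2_{2k(s,t)}} - \frac{1}{\sigma_{1k}^2 + \gamma\sigma_{2k}^2} \right)$$
$$+ \frac{1}{n_1(n_1-1)\, n_2(n_2-1)} \sum_{k=1}^{p} \sum_{i \neq j}^{n_1} \sum_{s \neq t}^{n_2} (\mu_{1k} - \mu_{2k})^2 \left( \frac{1}{\hat\sigma^2_{1k(i,j)} + \gamma\hat\sigma^2_{2k(s,t)}} - \frac{1}{\sigma_{1k}^2 + \gamma\sigma_{2k}^2} \right)$$
$$=: R_1 + R_2 + R_3.$$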

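Define $\hat\Lambda_{(i,j),(s,t)} = diag\{(\hat\sigma^2_{11(i,j)} + \gamma\hat\sigma^2_{21(s,t)})^{-1/2}, \dots, (\hat\sigma^2_{1p(i,j)} + \gamma\hat\sigma^2_{2p(s,t)})^{-1/2}\}$
and $Y_{ij} = (Y_{ij1}, \dots, Y_{ijp})^T$. Then

$$R_1 = \frac{1}{n_2(n_2-1)} \sum_{s \neq t}^{n_2} \left( \frac{1}{n_1(n_1-1)} \sum_{i \neq j}^{n_1} Y_{1i}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2) Y_{1j} \right)$$
$$+ \frac{1}{n_1(n_1-1)} \sum_{i \neq j}^{n_1} \left( \frac{1}{n_2(n_2-1)} \sum_{s \neq t}^{n_2} Y_{2s}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2) Y_{2t} \right)$$
$$- \frac{2}{(n_1-1)(n_2-1)} \sum_{i=1}^{n_1} \sum_{s=1}^{n_2} \left( \frac{1}{n_1 n_2} \sum_{j \neq i} \sum_{t \neq s} Y_{1j}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2) Y_{2t} \right).$$

By Theorem 1 in Park and Ayyala (2013), we have

$$\frac{1}{n_1(n_1-1)} \sum_{i \neq j}^{n_1} Y_{1i}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2) Y_{1j} = O(n^{-3} tr((\Lambda\Sigma_1\Lambda)^2)) = o(var(T_{n1})),$$

$$\frac{1}{n_2(n_2-1)} \sum_{s \neq t}^{n_2} Y_{2s}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2) Y_{2t} = O(n^{-3} tr((\Lambda\Sigma_2\Lambda)^2)) = o(var(T_{n1})),$$

$$\frac{1}{n_1 n_2} \sum_{j \neq i} \sum_{t \neq s} Y_{1j}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2) Y_{2t} = O(n^{-3} tr(\Lambda\Sigma_1\Lambda^2\Sigma_2\Lambda)) = o(var(T_{n1})).$$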
Thus, $R_1 = o_p(\sqrt{var(T_{n1})})$. Similarly,

$$R_2 = \frac{2}{n_2(n_2-1)(n_1-1)} \sum_{s \neq t}^{n_2} \sum_{j=1}^{n_1} \left( \frac{1}{n_1} \sum_{i \neq j} Y_{1i}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2)(\mu_1 - \mu_2) \right)$$
$$- \frac{2}{n_1(n_1-1)(n_2-1)} \sum_{i \neq j}^{n_1} \sum_{t=1}^{n_2} \left( \frac{1}{n_2} \sum_{s \neq t} Y_{2s}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2)(\mu_1 - \mu_2) \right).$$

By the proof of Theorem 2 in Park and Ayyala (2013), we have

$$\frac{1}{n_1} \sum_{i \neq j} Y_{1i}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2)(\mu_1 - \mu_2) = O(n^{-2} (\mu_1 - \mu_2)^T \Lambda^2\Sigma_1\Lambda^2 (\mu_1 - \mu_2)),$$

$$\frac{1}{n_2} \sum_{s \neq t} Y_{2s}^T (\hat\Lambda^2_{(i,j),(s,t)} - \Lambda^2)(\mu_1 - \mu_2) = O(n^{-2} (\mu_1 - \mu_2)^T \Lambda^2\Sigma_2\Lambda^2 (\mu_1 - \mu_2)).$$

By Condition (C3), we also have $R_2 = o_p(\sqrt{var(T_{n1})})$. Moreover,
$E(R_3^2) = O(n^{-1} ((\mu_1 - \mu_2)^T \Lambda (\mu_1 - \mu_2))^2) = o(var(T_{n1}))$. Thus, we
have proved that $T_{n2} = o_p(\sqrt{var(T_{n1})})$.